\htmlhr
\chapterAndLabel{Signature String Checker for string representations of types}{signature-checker}

The Signature String Checker, or Signature Checker for short, verifies that
string representations of types and signatures are used correctly.

Java defines multiple different string representations for types (see
Section~\ref{signature-annotations}), and it is easy to
misuse them or to miss bugs during testing.  Using the wrong string format
leads to a run-time exception or an incorrect result.  This is a particular
problem for fully qualified and binary names, which are nearly the same ---
they differ only for nested classes and arrays.

The paper ``Building and using pluggable
type-checkers''~\cite{DietlDEMS2011} (ICSE 2011,
\myurl{https://homes.cs.washington.edu/~mernst/pubs/pluggable-checkers-icse2011.pdf})
describes case studies of the Signature String Checker.


\sectionAndLabel{Signature annotations}{signature-annotations}

Java defines six formats for the string representation of a type.
There is an annotation for each of these representations.
Figure~\ref{fig-signature-hierarchy} shows how they are related;
examples appear in a table below.

\begin{figure}
\includeimage{signature-types}{7cm}
\caption{Partial type hierarchy for the Signature type system, showing
  string representations of a Java type.
  The type qualifiers are applicable to \<CharSequence> and its subtypes.
  Programmers usually only need to write
  the boldfaced qualifiers.  The other qualifiers (and some not shown) are
  included to improve the internal handling of String literals.}
\label{fig-signature-hierarchy}
\end{figure}

\label{signature-annotations-descriptions}

\begin{description}

\item[\refqualclass{checker/signature/qual}{FullyQualifiedName}]
  A \emph{fully qualified name} (\href{https://docs.oracle.com/javase/specs/jls/se17/html/jls-6.html#jls-6.7}{JLS \S
    6.7}), such as
  \<mypackage.Outer.Inner>, is used in Java code and in messages to
  the user.

\item[\refqualclass{checker/signature/qual}{ClassGetName}]
\begin{sloppypar}
  The type representation used by the
  \sunjavadoc{java.base/java/lang/Class.html\#getName()}{\code{Class.getName()}}, \<Class.forName(String)>,
  and \<Class.forName(String, boolean, ClassLoader)> methods.  This format
  is:  for any non-array type, the binary name; and for any array type, a
  format like the
  \href{https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-4.html#jvms-4.3.2}{FieldDescriptor
    field descriptor}, but using
  ``\<.>''~where the field descriptor uses ``\</>''.  See examples below.
\end{sloppypar}

\item[\refqualclass{checker/signature/qual}{FieldDescriptor}]
  A \emph{field descriptor} (\href{https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-4.html#jvms-4.3.2}{JVMS \S 4.3.2}), such as
  \<Lmypackage/Outer\$Inner;>, is used in a \<.class> file's constant pool,
  for example to refer to other types.  It abbreviates primitive types and
  array types.  It uses internal form (binary names, but with \</> instead of
  \<.>; see
  \href{https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-4.html#jvms-4.2.1}{JVMS
    \S 4.2}) for class names.  See examples below.

\item[\refqualclass{checker/signature/qual}{BinaryName}]
  A \emph{binary name} (\href{https://docs.oracle.com/javase/specs/jls/se17/html/jls-13.html#jls-13.1}{JLS \S 13.1}), such as
  \<mypackage.Outer\$Inner>, is
  the conceptual name of a type in its own \<.class> file.

\item[\refqualclass{checker/signature/qual}{InternalForm}]
  The \emph{internal form}
  (\href{https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-4.html#jvms-4.2}{JVMS
    \S 4.2}), such as \<mypackage/Outer\$Inner>, is how a class name is
  actually represented in its own \<.class> file.  It is also known as the
  ``syntax of binary names that appear in class file structures''.  It is
  the same as the binary name, but with periods (\<.>) replaced by slashes
  (\</>).  Programmers more often use the binary name, leaving the internal
  form as a JVM implementation detail.

\item[\refqualclass{checker/signature/qual}{ClassGetSimpleName}]
  The type representation returned by the
  \sunjavadoc{java.base/java/lang/Class.html\#getSimpleName()}{\code{Class.getSimpleName()}}
  method.  This format is not required by any method in the JDK, so you
  will rarely write it in source code.  The string can be empty.  This
  is not the same as the ``simple name'' defined in
  (\href{https://docs.oracle.com/javase/specs/jls/se17/html/jls-6.html#jls-6.2}{JLS
    \S 6.2}), which is the same as
  \refqualclass{checker/signature/qual}{Identifier}.

\item[\refqualclass{checker/signature/qual}{FqBinaryName}]
  An extension of binary name format to represent primitives and arrays.
  It is like \refqualclass{checker/signature/qual}{FullyQualifiedName}, but using
  ``\<\$>'' instead of ``\<.>'' to separate nested classes from their
  enclosing classes.  For example, \<"pkg.Outer\$Inner"> or
  \<"pkg.Outer\$Inner[][]"> or \<"int[]">.

\item[\refqualclass{checker/signature/qual}{CanonicalName}]
  Syntactically identical to
  \refqualclass{checker/signature/qual}{FullyQualifiedName}, but some
  classes have multiple fully-qualified names, only one of which is
  canonical (see
  \href{https://docs.oracle.com/javase/specs/jls/se17/html/jls-6.html#jls-6.7}{JLS
    \S 6.7}).

\end{description}

Other type qualifiers are the intersection of two or more qualifiers listed
above; for example, a
\refqualclass{checker/signature/qual}{DotSeparatedIdentifiers} is a string
that is a valid fully-qualified name \emph{and} a valid binary name.  A
programmer should never or rarely use these qualifiers, and you can ignore
them as implementation details of the Signature Checker, though you might
occasionally see them in an error message.  These qualifiers exist to give
literals sufficiently precise types that they can be used in any
appropriate context.

Java also defines other string formats for a type, notably qualified names
(\href{https://docs.oracle.com/javase/specs/jls/se17/html/jls-6.html#jls-6.2}{JLS
  \S 6.2}). The Signature Checker does not include annotations for these.

\label{signature-annotations-table}

Here are examples of the supported formats:\selflink{signature-annotations-examples}
\label{signature-annotations-examples}


\newcommand{\naforanon}{\emph{n/a {\smaller for anonymous class}}}
\newcommand{\naforanonarray}{\emph{n/a {\smaller for array of anon.~class}}}
\newcommand{\naforprim}{\emph{n/a {\smaller for primitive type}}}
\newcommand{\naforarray}{\emph{n/a {\smaller for array type}}}
\newcommand{\emptystring}{\emph{\smaller (empty string)}}

\begin{small}
\begin{center}
\begin{tabular}{|l|l|l|l|l|l|}
\hline
\multicolumn{1}{|c|}{fully qualified name} & \multicolumn{1}{c|}{Class.getName} & \multicolumn{1}{c|}{field descriptor} & \multicolumn{1}{c|}{binary name} & \multicolumn{1}{c|}{internal form} & \multicolumn{1}{c|}{Class.getSimpleName} \\ \hline
int                 & int                  & I                    & \naforprim          & \naforprim          & int            \\
int[][]             & [[I                  & [[I                  & \naforarray         & \naforarray         & int[][]        \\
MyClass             & MyClass              & LMyClass;            & MyClass             & MyClass             & MyClass        \\
MyClass[]           & [LMyClass;           & [LMyClass;           & \naforarray         & \naforarray         & MyClass[]      \\
\naforanon          & MyClass\$22          & LMyClass\$22;        & MyClass\$22         & MyClass\$22         & \emptystring \\
\naforanonarray     & [LMyClass\$22;       & [LMyClass\$22;       & \naforarray         & \naforarray         & []             \\
java.lang.Integer   & java.lang.Integer    & Ljava/lang/Integer;  & java.lang.Integer   & java/lang/Integer   & Integer        \\
java.lang.Integer[] & [Ljava.lang.Integer; & [Ljava/lang/Integer; & \naforarray         & \naforarray         & Integer[]      \\
pkg.Outer.Inner     & pkg.Outer\$Inner     & Lpkg/Outer\$Inner;   & pkg.Outer\$Inner    & pkg/Outer\$Inner    & Inner          \\
pkg.Outer.Inner[]   & [Lpkg.Outer\$Inner;  & [Lpkg/Outer\$Inner;  & \naforarray         & \naforarray         & Inner[]        \\
\naforanon          & pkg.Outer\$22        & Lpkg/Outer\$22;      & pkg.Outer\$22       & pkg/Outer\$22       & \emptystring \\
\naforanonarray     & [Lpkg.Outer\$22;     & [Lpkg/Outer\$22;     & \naforarray         & \naforarray         & []             \\
\hline
\end{tabular}
\end{center}
\end{small}

Java defines one format for the string representation of a method signature:

\begin{description}

\item[\refqualclass{checker/signature/qual}{MethodDescriptor}]
  A \emph{method descriptor} (\href{https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-4.html#jvms-4.3.3}{JVMS \S
    4.3.3}) identifies a method's signature (its parameter and return
  types), just as a field descriptor identifies a
  type.   The method descriptor for the method
\begin{Verbatim}
    Object mymethod(int i, double d, Thread t)
\end{Verbatim}
\noindent is
\begin{Verbatim}
    (IDLjava/lang/Thread;)Ljava/lang/Object;
\end{Verbatim}

\end{description}


\subsectionAndLabel{How to choose which annotation to use}{signature-choosing-annotation}

Sometimes, there are multiple valid annotations for a value.  As an
example, a non-primitive non-array type is represented identically by
\<@BinaryName>, \<@ClassGetName>, and \<@FqBinaryName>.  Using the lowest
type in the type hierarchy (in this case, \<@BinaryName>) has two
advantages.  First, it acts as documentation that the value is never a
primitive or array.  Second, it permits the value to be used in any of the
three contexts:  as a \<@BinaryName>, a \<@ClassGetName>, or a
\<@FqBinaryName>.

Casting to \<@BinaryName> from one of the other types adds clutter if it is not
necessary.  Suppose that a method returns a \<@ClassGetName> and the value will
only be used in contexts that require a \<@ClassGetName> (say, it is passed to a
method that requires an argument of that type).  Then there is no point in
casting to \<@BinaryName> in between, even if you know the type being
represented is not a primitive or an array.


\sectionAndLabel{What the Signature Checker checks}{signature-checks}

Certain methods in the JDK, such as \<Class.forName>, are annotated
indicating the type they require.  The Signature Checker ensures that
clients call them with the proper arguments.  The Signature Checker does
not reason about string operations such as concatenation, substring,
parsing, etc.

\begin{sloppypar}
To run the Signature Checker, supply the
\code{-processor org.checkerframework.checker.signature.SignatureChecker}
command-line option to javac.
\end{sloppypar}


% LocalWords:  Regex regex quals FullyQualifiedName BinaryName FieldDescriptor
% LocalWords:  Lpackage MyClass MethodDescriptor forName substring boolean
% LocalWords:  jls getName ClassGetName ClassLoader LMyClass Ljava jvms
% LocalWords:  DotSeparatedIdentifiers CharSequence Lmypackage mypackage
% LocalWords:  FieldDescriptorWithoutPackage InternalForm getSimpleName
% LocalWords:  ClassGetSimpleName FqBinaryName CanonicalName checkers''
% LocalWords:  Lpkg
